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Abstract 

We introduce an alternative formulation of the exact stochastic simulation algorithm (SSA) for 
sampling trajectories of the chemical master equation for a well-stirred system of coupled chemical 
reactions. Our formulation is based on factored-out, partial reaction propensities. This novel exact 
SSA, called the partial propensity direct method (PDM), is highly efficient and has a computational 
. \ cost that scales at most linearly with the number of chemical species, irrespective of the degree 

. of coupling of the reaction network. In addition, we propose a sorting variant, SPDM, which is 

O \ especially efficient for multiscale reaction networks. 
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^ ; 1 Introduction 

I In chemical kinetics, the temporal evolution of a well-stirred system of chemically reacting molecules 
is classically described using reaction rate equations. Reaction-rate equations are a mean-field descrip- 
tion formulated as coupled ordinary diff'erential equations. The number of molecules is continuous in 
time, and reaction rates are quantified using macroscopic rate constants. Such reaction rate equa- 
^ ' tions, however, do not always provide an accurate description. This is the case especially, but not 
■ only, when the number of molecules of the various chemical species (henceforth called the population) 
is much smaller than Avogadro's number [1, 2]. At low population, the number of molecules is not 
large enough for fluctuations to be negligible. In addition, fluctuations may play an important role in 
the kinetics [1, 3]. Even at high population, correlated fluctuations can cause the mean to behave in 
a way that is not captured by a mean-fleld description [2, 4, 5]. These effects can be accounted for 
by stochastic kinetic models, which can incorporate thermal fluctuations in a number of ways. An 
approach that has become canonical is the chemical master equation (CME) [6, 7, 8], a Markov-chain 
model with many applications in physics, chemistry, and biology. The CME models the kinetics of any 
chemical reaction system that is well stirred and thermally equilibrated [8]. Its high dimensionality, 
however, renders analytical approaches intractable. 
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Numerical methods to sample trajectories from the CME mostly rely on kinetic Monte Carlo ap- 
proaches [9]. The canonical kinetic Monte Carlo approach for sampling a trajectory of the CME is 
Gillespie's stochastic simulation algorithm (SSA) [6, 7, 8]. SSA is governed by the joint probability 
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density 

p{T,ii\n{t)) = {ae~<'^){aJa) (1) 

for the random variables r (the time to the next reaction) and /i (the index of the next reaction). The 
vector n{t) = (ni, . . . ,71^) is the population at time t. Each entry rii is the number of molecules of 
the respective species Sj, and N is the total number of species. The propensity of each reaction ji is 
defined as = c^h^, where is the specific probability rate, and /i^ = ^''fii^) is the reaction degen- 
eracy, which is the number of possible combinations of reactants in reaction ^ given the population 
n. The reaction propensity is such that a^^di is the probability that a randomly selected combination 
of reactant molecules of reaction /x at time t will react in the next infinitesimal time interval dt. The 
total propensity is a = J2^=i where M is the total number of reactions. 

Existing SSA formulations can be classified into exact and approximate methods. Exact methods sam- 
ple from the probability density in Eq. 1. These formulations include the direct method (DM) [6, 8], 
the first reaction method (FRM) [6], Gibson-Bruck's next-reaction method (NRM) [10], a Gibson- 
Bruck variant of the DM [10], the optimized direct method (ODM) [11], the sorting direct method 
(SDM) [12], the logarithmic direct method (LDM) (unpublished, [13]), and the composition-rejection 
formulation (SSA-CR) [14]. Approximate SSA formulations provide better computational efficiency 
for large numbers of molecules by sampling from an approximation to the probability density in 
Eq. 1. These methods include r-leaping [15, 16, 17, 18], fca-leaping [15], i?-leaping [19], L-leap [20], 
K-le&p [21], the slow-scale method [22], and implicit r-leaping [23]. 

In this paper we focus on exact methods. They offer the advantage of being parameter-free, whereas 
all approximate methods contain parameters that need to be adjusted by the user. The computational 
cost of exact SSA formulations is dominated by the steps needed to sample the next reaction and up- 
date the propensities after a reaction has fired [10, 11, 14]. In Gillespie's original DM and FRM, this 
leads to a computational cost that scales linearly with the number of reactions in the network. Various 
improved SSA formulations have been proposed in order to reduce this computational cost. The most 
notable improvements include the use of dependency graphs to reduce the number of propensities 
that need to be updated [10], and various samphng schemes of higher efficiency [10, 11, 12, 14]. All 
of these sampling schemes can be interpreted as instances of the random-variate generation problem 
[14] as described in Devroye's compendium [24] and can reduce the computational cost (CPU time) 
of sampling the next reaction. These improvements have reduced the computational cost of SSA to 
logarithmic or even constant scaling for weakly coupled networks. For strongly coupled networks, how- 
ever, the computational cost of all improved SSA formulations still scales linearly with the number of 
reactions. We define weakly coupled networks as those where the maximum number of reactions that 
are influenced by any other reaction, i.e. the maximum degree of coupling of the network, is indepen- 
dent of system size. This is in contrast to strongly coupled networks, where the number of influenced 
reactions grows proportionally with system size and can be as large as the total number of reactions. 
In such networks, the total number of reactions grows faster than the number of species when the 
latter is increased. Strongly coupled networks frequently occur, e.g., in nucleation-and-growth models, 
scale-free biochemical networks, and colloidal aggregation systems. In these cases, the scaling of the 
computational cost of most of the improved SSA formulations with system size is equivalent to that 
of DM (see Sec. 2). 

We present a novel SSA formulation with a computational cost that scales at most linearly with the 
number of species, making it especially efficient for strongly coupled networks. This is made possible 
by restricting the class of systems to networks containing only elementary chemical reactions, where 
every reaction has at most two reactants [8] . This allows factoring out one of the species from every re- 



2 



action propensity, leading to partial propensities that depend on the population of at most one species. 
Any non- elementary reaction can always be broken down into elementary reactions, at the expense of 
an increase in system size [8, 25, 26]. The use of partial propensities leads to SSA formulations with 
a computational cost that scales as some function of the number of species rather than the number of 
reactions. 

In Sec. 3, we formally introduce the concept of partial propensities and present two partial propen- 
sity variants of the exact SSA: the partial-propensity direct method (PDM) and the sorting partial- 
propensity direct method (SPDM). They use partial propensities and efficient data structures for 
sampling the next reaction and for updating the partial propensities after a reaction. We benchmark 
them in Sec. 4 and show that their computational cost scales at most linearly with the number of 
species in the network, irrespective of the degree of coupling. The benchmarks include two strongly 
coupled networks, for which the degree of coupling grows with system size, a weakly coupled reaction 
network with a constant maximum degree of coupling, and a small, fixed-size biological multiscale 
(stiff) network. In order to test the competitiveness of our algorithm in cases where several other SSA 
formulations might be more efficient, we choose the most weakly coupled network possible, the linear 
chain model, where the number of reactions scales linearly (with a proportionality constant of 1) with 
the number of species [27, 11]. The multiscale biological network is included in order to benchmark the 
new algorithms on small systems and when the reaction propensities span several orders of magnitude. 
In Sec. 5 we summarize the main results, discuss the limitations of the presented method, and give an 
outlook on possible future developments and applications. 

2 Computational cost of previous exact SSA formulations 

We review the scaling of the computational cost of previous exact SSA formulations. In order to 
express scaling with system size x, we use the Bachmann-Landau notation, writing C{x) G 0{f{x)) 
{C{x) is 0{f{x))) whenever C{x) > is bounded from above by f{x) as C{x) < af{x), for all x and 
some constant pre- factor a > that is independent of x. 

Since DM and FRM form the basis for most exact SSA's, we first focus on these two. DM's compu- 
tational cost is 0{M) [6, 8, 10, 11, 14], where M is the total number of reactions (see also Appendix 
A). In FRM, the sampling strategy for /i and r is different (see Appendix A). This, however, docs 
not change the scaling of the computational cost of FRM, which remains 0{M). Since the FRM 
sampling strategy involves discarding M — 1 reaction times, its computational cost generally has a 
larger pre- factor than that of DM [6, 10, 11]. 

NRM is an improvement over FRM in which the M — 1 unused reaction times are suitably reused, 
and data structures such as indexed priority queues and dependency graphs are introduced. The 
indexed priority queue, which is equivalent to a heap tree, is used to sort the r^'s more efficiently; the 
dependency graph is a data structure that contains the indices of the reactions whose propensities 
are to be recomputed after a certain reaction /j, has fired. This avoids having to recompute all a^^s 
after every reaction. Each reaction is represented as a node in the dependency graph, and nodes i 
and j are connected by a directed edge if and only if the execution of reaction i affects the propensity 
(through the population of reactants) of reaction j. These data structures, together with the reuse 
of reaction times, reduce the computational cost of NRM to 0(fclog2M), where k is the out-degree 
of the dependency graph, that is, the degree of coupling of the reaction network. In strongly coupled 
networks. A; is a function of M and is 0{M). The computational cost of NRM is thus 0{M) for 
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strongly coupled networks. Even for some weakly coupled networks, the computational cost of NRM 
has been empirically shown to be 0{M) [11]. This is due to the additional overhead, memory-access 
operations, and cache misses introduced by the complex data structures (indexed priority queue, de- 
pendency graph) of NRM. The scaling of the computational cost of the Gibson-Bruck variant of DM 
is equal to that of NRM, albeit with a larger pre- factor [10]. For weakly coupled networks where 
k{M) is 0(1), independent of system size, the computational cost is further reduced to 0(1) in the 
SSA-CR formulation [14] under the assumption that the ratio of maximum to minimum propensity is 
bounded. For strongly coupled networks, where k{M) is 0(M), the computational cost of SSA-CR is 
0(M) [14, 28]. 

ODM is an improvement over DM where the reactions are sorted in descending order of firing fre- 
quency. This makes it more probable to find the next reaction close to the beginning of the list and, 
hence, reduces the search depth for finding the index of the next reaction using linear search. ODM 
estimates the firing frequencies of all reactions during a short prc-simulation run of about 5-10% of 
the length of the entire simulation [11, 12]. In order to reduce the cost of updating the propensities 
after a reaction has fired, ODM also uses a dependency graph. Irrespective of the degree of coupling, 
the computational cost of ODM is 0(M), which was confirmed in benchmarks [11]. SDM is a variant 
of ODM that does not use pre-simulation runs, but dynamically shifts up a reaction in the reaction 
list whenever it fires ( "bubbling up" the more frequent reactions) . This further reduces the pre- factor 
of the computational cost of SDM compared to that of ODM, but the scaling remains 0{M) [12]. 

LDM uses a binary search tree (recursive bisection) on an ordered linear list of cumulative sums of 
propensities to find the next reaction. This is reported to reduce the average search depth of this step 
to 0(log2 M) [13]. Irrespective of the degree of coupling, however, the update step is 0{M) since on 
average (M + l)/2 sums of propensities need to be recomputed, rendering the computational cost of 
LDM 0{M). 

In summary, the computational cost of previously reported exact SSA formulations is 0{M) for 
strongly coupled networks. For weakly coupled networks, however, some are significantly more efficient 
and can be 0(log2 M) or even 0(1). 

3 Partial-propensity methods 

We introduce the concept of partial propensities for elementary reactions and use it to formulate two 
partial-propensity direct methods, PDM and SPDM, whose computational cost scales at most linearly 
with the number of species, even for strongly coupled networks. SPDM uses concepts from SDM [12] 
to dynamically rearrange reactions, which reduces the average search depth for sampling the next 
reaction in a multiscale network. 

We define the partial propensity of a reaction with respect to one of its reactants as the propensity 
per molecule of this reactant. For example, the partial propensity vr/j of reaction n with respect to 
(perhaps the only) reactant S^ is an/rii, where is the propensity of reaction /x and is the number 
of molecules of S^. The partial propensities of the three elementary reaction types are: 

• Bimolecular reactions (Sj -|- Sj Products): = niUjCfj, and vr^*'' = n^c^, vr^^^ = riiC^. 
If both reactants are of the same species, i.e. Sj = Sj, only one partial propensity exists, Trji^ = 
^{rii — l)c^ because the reaction degeneracy is ^ni{ni — 1). If = 0, the partial propensity 
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becomes negative. As explained in the caption of Fig. 1 this, however, does not require any 
special treatment. 

• Unimolecular reactions (S^ Products): = riiC^ and Trji^ = C/j,. 

• Source reactions (0 — > Products): = and tt^^ = c^. 

We consider only these elementary reaction types since any reaction with three of more reactants can 
be treated by decomposing it into a combination of elementary reactions [8, 25, 26]. 



3.1 The petrtial-propensity direct method (PDM) 

In PDM, the index of the next reaction jj, is sampled in a way that is algebraically equivalent to that 
of DM, as shown in Appendix B. The major novelties in PDM are the use of partial propensities 
and efficient data structures that reduce the number of operations needed to sample ^ and to update 
the partial propensities. The time to the next reaction is sampled as in DM. We first present the 
main principles behind the new sampling and update schemes and then describe them in detail. The 
complete algorithm is given in Table 1. 



3.1.1 Main principles behind PDM 

PDM uses partial propensities and groups them in order to efficiently sample the index of the next 
reaction and update the partial propensities after a reaction has fired. For the sampling step, the 
partial propensities are grouped according to the index of the factored-out reactant, yielding at most 
N + 1 groups of size 0{N). Sampling then proceeds in two steps: we first sample the index of the 
group before sampling the actual partial propensity inside that group. This grouping scheme reduces 
the number of operations needed for sampling the next reaction using a concept that is reminiscent 
of two-dimensional cell lists [29]. If all partial propensities are in the same group, or if every group 
contains only a single partial propensity, the sampling step of PDM is no more efficient than that of 
DM. These cases, however, can only occur if the function M{N) is 0{N) (see for example the linear 
chain model) and both PDM and DM hence have a computational cost of 0{N) for sampling the 
index of the next reaction. 

After the selected reaction has been executed, we use a dependency graph over species (partial propen- 
sities), rather than reactions, to find all partial propensities that need to be updated. This is possible 
because partial propensities depend on the population of at most one species, and is analogous to 
a Verlet list [30]. This limits the number of updates to be 0{N). In addition, partial propensities 
of unimolecular reactions are constant and never need to be updated. In weakly coupled networks, 
where the degree of coupling is 0(1), the scaling of the computational cost of the update becomes 
equal to that of methods that use dependency graphs over reactions, such as SSA-CR, ODM, and SDM. 

We illustrate the sampling scheme of PDM in a simple protein aggregation example. Consider pro- 
teins that aggregate to form at most tetramcric complexes. There arc = 4 species in the reaction 
network: monomers, dimers, trimers, and tetramers. All species except tetramers can aggregate in 
all possible combinations to form multimeric complexes (4 bimolecular reactions). In addition, all 
multimeric complexes can dissociate into any possible combination of two smaller units (4 unimolec- 
ular reactions) and monomers are constantly produced (1 source reaction). This reaction network 
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is described by M = 9 partial propensities (vr['^'*), {tt^\ "^3^^ ^4^^)> ("^5^^ ^6^^)' (^7^^)) ^g'^'*)- 
Grouping the partial propensities according to the index of the factored-out reactant given in the 
superscript, we obtain 5 (= A'" + 1) groups as indicated by the parentheses. Along with each group, 
we store the sum of all partial propensities inside it. Using a random number, we sample the group 
that contains the next reaction, before finding the corresponding partial propensity inside that group. 

Assume that in our example reaction 7 is to fire next. The search depth to find the group index 

(3) 

is 4 and we need 1 additional operation to find the partial propensity {tTj ). PDM thus requires 5 
operations to sample the next reaction in this network of 9 reactions. The average search depth of 
sampling the next reaction in this example is 37/9 « 4.1. 



In the next section, we formally describe PDM and its data structures. 



3.1.2 Detailed description of the PDM algorithm 

All partial propensities are stored in the "partial-propensity structure" 11 = {Iljlj^Q as a one-dimensional 
array of one-dimensional arrays Ilj. Each array Ilj contains the partial propensities belonging to group 
i. The partial propensities of source reactions are stored as consecutive entries of the O*'^ array Ho. 
The partial propensities of all reactions that have species Si as one of its reactants are stored as 
consecutive entries of IIi. In general, the i^^ array Xlj contains the partial propensities of all reactions 
that have Sj as a reactant, provided these reactions have not yet been included in any of the previous 

(i) 

nj<j. That is, out of the two partial propensities of a reaction /x with Sj and Sj as its reactants, tt^i^ 
is part of Xlj if i < j, and not stored anywhere. Notice that, since the different Ilj's can be 

of different length, storing them as an array of arrays is more (memory) efficient than using a matrix 
(i.e. a two-dimensional array). The reaction indices of the partial propensities in 11 are stored in a 
look-up table L = {Lj}^Q, which is also an array of arrays. This makes every reaction /x identifiable 
by a unique pair of indices, a group index / and an element index J, such that the partial propensity 
of reaction /x = L/^j is stored in II/^j. 



We further define the "group-sum array" A, storing the sums of the partial propensities in each group 
n^, thus Ai = j].n 



of all groups, as Sj 



n. 



^ = 0,...,N. In addition, we also define S, the array of the total propensities 
jAj, i = 1, . . . ,N, and Sq = ^o- The total propensity of all reactions is then 
a = '^^^ °^ ^ avoids having to recompute the sum of all partial propensities in Ilj after 

one of them has changed. Rather, the same change is also applied to Aj and computing the new 
Sj only requires a single multiplication by nj. Using these data structures and a single uniformly 
distributed random number ri G [0, 1), the next reaction /x can efficiently be sampled in two steps: 
(1) sampling the group index / such that 



/ = min 



I' : na<J2^i 



i=Q 



(2) 



and (2) sampling the element index J in 11/ such that 

J = min J' : ria 



j=l \i=o / 



(3) 



(See Appendix B for a proof of the equivalence of this sampling scheme to that of DM.) Using the 



6 



temporary variables 



i=0 



Via — $ + E/ 
ni 



(4) 



Eq. 3 can be efficiently implemented as 



J' 



J = min 



J' : *<^n,,. 



(5) 



The indices / and J are then translated back to the reaction index /x using the look-up table L, thus 



Once a reaction has been executed, n, 11, A, and XI need to be updated. This is efficiently done using 
three update structures: 

is a array of M arrays, where the i array contains the indices of all species involved in the i 
reaction. 

is a array of M arrays containing the corresponding stoichiometry (the change in population of 
each species upon reaction) of the species stored in U^^^ . 

is a array of N arrays, where the i*^ array contains the indices of all entries in 11 that depend 
on rii, thus: 



^1 ^ ~ (^I'il ^2'i2 

^(3) ^ < ^2 = {iiJi ilJi ■■■) 

tt(3) _ (-N -N -N -N 

When a reaction is executed, the populations of the species involved in this reaction change. Hence, 
all entries in 11 that depend on these populations need to be updated. After each reaction, we use 
U^^^ to determine the indices of all species involved in this reaction. The stoichiometry is then looked 
up in U^^^ and the population n is updated. Subsequently, U^^^ is used to locate the affected entries 
in n and recompute them. The two data structures U^^^ and U^^^ are a sparse representation of 
the stoichiometry matrix, and U^'^^ represents the dependency graph over species. Since the partial 
propensities of unimolecular and source reactions are constant and need never be updated, U^"^) only 
contains the indices of the partial propensities of bimolecular reactions. The size of U(3) is at most 
a factor of N smaller than that of the corresponding dependency graph over reactions, since partial 
propensities depend on the population of at most one species. Figure 1 summarizes the data structures 
used in PDM for an example reaction network. The complete algorithm is given in Table 1. Overall, 
PDM's computational cost is 0{N) and its memory requirement is 0{M), irrespective of the degree 
of coupling (see Appendix C). 



(6) 
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3.2 The sorting pEirtial-propensity direct method (SPDM) 

The sorting partial-propensity direct method (SPDM) is the partial-propensity variant of SDM [12]. 
In SPDM, the group and element indices / and J are bubbled up whenever the reaction jj, = L/_j fires. 
The reordered indices are stored in an array for I, and an array of arrays of the size of 11 for the J's. 
This requires an additional N + M memory, but further reduces the search depth to sample the next 
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reaction, especially in a multiscale (stiff) network. The computational cost of SPDM is also 0{N) 
(see Appendix C), but with a possibly reduced pre-factor. 

4 Benchmarks 

We benchmark the computational performance of PDM and SPDM using four chemical reaction net- 
works that are prototypical of: (a) strongly coupled reaction networks, (b) strongly coupled reaction 
networks comprising only bimolecular reactions, (c) weakly coupled reaction networks, and (d) mul- 
tiscale biological networks. The first two benchmarks consider strongly coupled networks where the 
degree of coupling scales with system size (see column "degree of coupling" in Table 2). The first 
benchmark consists of a colloidal aggregation model. The second benchmark considers a network of 
only bimolecular reactions, where none of the partial propensities are constant. In the third bench- 
mark, we compare PDM and SPDM to SDM on the linear chain model, a weakly coupled reaction 
network with the minimal degree of coupling, for which SDM was reported to be very efficient [11, 12]. 
The fourth benchmark considers the heat-shock response model, a small multiscale (stiff) biological 
reaction network of fixed size. The benchmark problems are defined in detail in Appendix D, where 
also the respective partial-propensity structures 11 are given. 

All tested SSA formulations are implemented in C++ using the random-number generator of the GSL 
library and compiled using the GNU C++ compiler version 4.0.1 with the 03 optimization flag. All 
timings are determined using a nanosecond-resolution timer (the iiiach_absolute_time() system call) 
on a MacOS X 10.4.11 workstation with a 3 GHz dual-core Intel Xeon processor, 8 GB of memory, 
and a 4MB L2 cache. For each test case, we report both the memory requirement and the average 
CPU time per reaction (i.e. per time step), O. 6 is defined as the CPU time (identical to wall-clock 
time in our case) needed to simulate the system up to final time T, divided by the total number of 
reactions executed during the simulation, and averaged over independent runs. The time Q does not 
include the initialization of the data structures (step 1 in Table 1) as this is done only once and is not 
part of the time loop. 

We explain the benchmark results in terms of the computational cost of the individual steps of the 
algorithms. We distinguish three steps: (a) sampling the index of the next reaction, (b) updating the 
population, and (c) updating the partial propensities (for PDM and SPDM) or the propensities (for 
SDM). The computational costs of these steps are quantified separately and the overall timings are 
then explained as a weighted sum of: 

• C/^: The number of operations required to sample the index of the next reaction (for PDM, this 
is step 2 in Table 1). 

• Cn- The number of elements of the population n that need to be updated after executing a 
reaction (for PDM, this is step 4 in Table 1). 

• Cp: The number of (partial) propensities that need to be updated after executing a reaction (for 
PDM, this is step 5.2.2 in Table 1). 

The expressions for these elementary costs are given in Table 3 as determined by independently fitting 
models for the scaling of the algorithms to the measured operation counts, averaged over 100 inde- 
pendent runs of each test problem. In all cases, the models used for the computational cost explain 
the data with a correlation coefficient of at least 0.98. The benchmark results are then explained by 
fitting the weights of the cost superposition aC^ + 6C„ + cCp to the measured scaling curves @{N) 
using the expressions given in Table 3. In order to preserve the relative weights of the data points. 



8 



all fits arc done on a linear scale, even though the results are plotted on a logarithmic scale for two 
of the benchmarks. All these fits also have a correlation coefficient of at least 0.98. Explaining the 
timing results as a superposition of elementary costs allows determining which part of an algorithm 
is responsible for a particular speedup or scaling behavior, and what the relative contributions of the 
three algorithmic steps are to the overall computational cost. 

The memory requirements of the algorithms are reported in Table 4 for all benchmark cases. These 
numbers were derived analytically from the size of the individual data structures. 

4.1 Strongly coupled reaction network: colloidal aggregation model 

We use the colloidal aggregation model in Appendix DD.l [31, 32, 33, 34, 35] as a first example of a 
strongly coupled reaction network. This reaction network can be used to model, e.g., colloidal aggre- 
gation of solvated proteins, nano-beads, or viruses. For A'' chemical species it consists of M = [^J 
reactions and the maximum out-degree of the dependency graph is 3N — 7 and hence scales with 
system size (see Table 2). 

The colloidal aggregation model is simulated up to time T = 100 with specific probability rates 
Cn,m = 1 and Cp^g = 1. At time t = 0, m = N6i.i. The scaling of 9 for PDM, SPDM, and SDM 
with system size is shown in Fig. 2(a), averaged over 100 independent runs. 0^^^'^ and Q^pdm ^^.^ 
O(Ar0-5) fQj. gj^aii (iggg ^han about 100) and 0{N) for large N. 0^°^ is 0{N^). The pre-factor of 
qSPDM jg similar to that of 0^°^, since in this network is not significantly reduced by the dynamic 
sorting (Table 3). The memory requirements of PDM and SPDM are 0{N^) = 0{M), that of SDM 
is 0{N^) = 0{NM) (Table 4). 

In summary, the computational costs of both PDM and SPDM are 0{N). This scaling is mediated by 
all three cost components. The use of partial propensities renders the scaling of the sampling cost 
0{N) (see Table 3). The cost Cp for updating the partial propensities is 0{N^'^) (Table 3), since the 
use of partial propensities allows formulating a dependency graph over species, rather than reactions, 
and unimolecular reactions have constant partial propensities. This leads to a smaller number of 
updates needed as shown in Fig. 3(a). 

4.2 Strongly coupled network of bimoleculEir reactions 

The network in Appendix DD.2 consists of M = ^i-^ " strongly coupled bimolecular reactions, 
such that none of the partial propensities are constant. Both the minimum and the maximum out- 
degrees of the dependency graph in this case are AN — 10, scaling faster with N than in the previous 
case (see Table 2). 

We simulate this network up to time T = 0.001 with all specific probability rates q = 1. At t = 0, 
Hi = 100(5Ar_4,i + 5N-3,i + SN-2,i + SN-i,i + ^Af.i)- The scaling of 6 for PDM, SPDM, and SDM with 
system size is shown in Fig. 2(b), averaged over 100 independent runs, e^^^ and eSPDM 

are 0{N), 

whereas 9^°^^ is 0{N'^). The pre-factors of PDM and SPDM are comparable. The memory require- 
ments of PDM and SPDM are 0{N^) = 0{M), that of SDM is 0{N^) = 0{NM) (see Table 4). 
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In summary, the computational costs of PDM and SPDM are 0{N) for this strongly coupled, purely 
bimolecular network. The scaling is again mediated by all three cost components. Grouping the 
partial propensities renders the sampling cost 0{N) (see Table 3). Because none of the partial 
propensities are constant, the update costs Cp of PDM and SPDM are 0{N), as in SDM, albeit with 
a pre-factor that is ~ 2.5 times smaller than that in SDM. One reason for this smaller pre-factor is 
the smaller number of updates needed upon reactions firing, as shown in Fig. 3(b). This is due to the 
fact that partial propensities of bimolecular reactions depend on the population of only one species, 
which reduces the number of combinations that need to be updated. 

4.3 Wecikly coupled reaction network: linecir chain model 

We benchmark PDM and SPDM on a weakly coupled model in order to assess their limitations in 
cases where other SSA formulations might be more efficient. We choose the linear chain model defined 
in Appendix DD.3 since it is the most weakly coupled reaction network possible and it has been used 

as a model for isolated signal transduction networks [27]. For M reactions, it involves the minimum 
number of species N = M + 1, and the maximum out-degree of the dependency graph is constant at 
the minimum possible value of 2 (see Table 2), since every reaction at most influences the population 
of its only reactant and of the only reactant of the subsequent reaction. 

We simulate the linear chain model to a final time of T = 1000 with all specific probability rates 
Ci = 1. At time t = 0, rii = 10000(^1.;. Figure 2(c) presents the scaling of the CPU time with system 
size for PDM, SPDM, and SDM, averaged over 100 independent runs. 0^^^^ scales linearly with A'^ 
and -y^rith AT^-^. -g 0{N) with a pre-factor that is more than 4 times larger than that 

of OP^'^. This difference in pre-factor is mainly caused by PDM having smaller Cn and Cp (Table 
3). C^, however, scales worse for PDM than for SDM due to the dynamic sorting in SDM. This is 
overcome in SPDM, where is 0{N^'^), as in SDM. The memory requirements of SPDM and PDM 
are 0{N) = 0{M), that of SDM is 0{N^) = 0{NM) (Table 4). 

In summary, the computational costs of PDM and SPDM on the weakly coupled linear chain model 

are governed by (a) updating the population n using a sparse stoichiomctry representation and (b) 
never needing to update the partial propensities of unimolccular reactions. Since the linear chain 
model contains only unimolecular reactions, none of the partial propensities ever needs to be updated, 
leading to an update cost of Cp = (see Table 3). While we have implemented SDM according to 
the original publication [12], we note that if one uses a sparse representation of the stoichiomctry 
matrix also in SDM, point (a) vanishes and Cn = 2 also for SDM. A sparse-stoichiometry SDM would 
thus have the same scaling of the computational cost on the linear chain model as would SPDM, 
outperforming PDM. 

4.4 Multi-scale biological network: heat-shock response in Escherichia coli 

We assess the performance of PDM and SPDM on a small, fixed-size multiscale reaction network. We 
choose the heat-shock response model since it has also been used to benchmark previous methods, 
including ODM [11] and SDM [12]. The model describes one of the mechanisms used by the bacterium 
E. coli to protect itself against a variety of environmental stresses that are potentially harmful to the 
structural integrity of its proteins. The heat-shock response (HSR) system reacts to this by rapidly 
synthesizing heat-shock proteins. The heat-shock sigma factor protein cr^^ activates the HSR by in- 
ducing the transcription of heat-shock genes. The heat-shock response model is a small multiscale 
reaction network (the specific probability rates span 8 orders of magnitude) with N = 28 chemical 
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species, M = 61 reactions, and a maximum out-degree of the dependency graph of 11 (see Table 2). 
For a detailed description of the model, we refer to Kurata et al. [36] 

We simulate the HSR model for T = 500 seconds. During this time, approximately 46 million reactions 
are executed. For a single run, we measure 

@PDM ^ 0.256^8 and Q^^^ = 0.272 /xs. This corresponds 
to a simulated 3.68 million reactions per second of CPU time for SDM and 3.89 million reactions 
per second for PDM. Hence, PDM is about 6% faster than SDM. This speed-up is mainly due to a 
smaller Cp in PDM (see Fig. 3(c) for the distribution of updates over all reactions) since the partial 
propensities of unimolecular reactions never need to be updated. The speed-up is, however, modest 
because of PDM is 4.6 times larger than that of SDM (Table 3). This is due to the fact that 95% 
of all reaction firings are caused by a small subset of only 6 reactions. This multiscale network thus 
strongly benefits from the dynamic sorting used in SDM. This advantage can be recovered in SPDM, 
where is comparable to that of SDM, and _ 0.245 /xs (4.08 million reactions per second). 

This makes SPDM 11% faster than SDM on this small network. 

5 Conclusions and Discussion 

The stochastic simulation algorithm (SSA) [6, 7, 8] is widely used for computational stochastic reaction 
kinetics in chemistry, physics, biology, and systems biology. It is included in most existing stochastic 
simulation software packages and is standard in courses on computational chemical kinetics. Due to 
this importance, several variants of the original SSA formulation have been published that reduce 
the computational costs of the sampling and update steps. When simulating weakly coupled reaction 
networks, where the maximum number of reactions that arc influenced by any reaction is constant 
with system size, the computational cost of the sampling step has been reduced to be 0(log2 M) [10], 
where M is the total number of reactions, and even to 0(1) under some conditions for the propensity 
distribution [14]. Using dependency graphs, also the update step has been reduced to be 0(1) for 
weakly coupled networks [11, 12, 14]. For strongly coupled reaction networks, where the degree of 
coupling increases with system size and can be as large as the total number of reactions, all previous 
exact SSA formulations have a computational cost that is 0{M). 

We have introduced a new quantity called partial propensity and have used it to construct two novel 
formulations of the exact SSA: PDM and its sorting variant SPDM. Both are algebraically equivalent 
to DM and yield the same population trajectories n(t) as to those produced by DM. In our formula- 
tion of partial propensities, we have limited ourselves to elementary chemical reactions. Since their 
partial propensities depend on the population of at most one species, both new SSA formulations 
have a computational cost that scales at most linearly with the number of species rather than the 
number of reactions, independently of the degree of coupling. This is particularly advantageous in 
strongly coupled reaction networks, where the number of reactions M grows faster than the number 
of species N with system size. On networks of fixed size, PDM and SPDM are especially efficient 
when M ^ N. PDM's computational cost is 0{N), which is made possible by appropriately grouping 
the partial propensities in the sampling step and formulating a dependency graph over species rather 
than reactions in the update step. Moreover, the partial propensities of unimolecular reactions and 
source reactions are constant and never need to be updated. This further reduces the size of the 
dependency graph and the computational cost of the update step. To our knowledge, PDM is the first 
SSA formulation that has a computational cost that is 0{N), irrespective of the degree of coupling of 
the reaction network. In the case of multiscale networks, the computational cost of SPDM is smaller 
than that of PDM. 
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We have benchmarked PDM and SPDM on four test cases of various degrees of coupling. The first two 
benchmarks considered strongly coupled networks, where the degree of coupling scales proportionally 
to the number of species. The third benchmark considered the most weakly coupled network possible, 
where several other SSA formulations might be more efficient. Finally, the fourth benchmark consid- 
ered a small biological multiscale network. These benchmarks allowed estimating the scaling of the 
computational cost with system size and the cost contributions from reaction sampling, population 
update, and partial-propensity update. The results showed that (a) the overall computational costs of 
PDM and SPDM are 0{N), even for strongly coupled networks, (b) on very weakly coupled networks, 
SPDM is competitive compared to SDM, (c) on multiscale networks SPDM outperforms PDM, and 
(d) the memory requirements of PDM and SPDM are 0(M) in all cases, and hence not larger than 
those of any other exact SSA formulations. 

Currently, PDM and SPDM have a number of limitations. The most important limitation is that the 

presented formulation of partial propensities is only applicable to elementary chemical reactions. Any 
higher-order chemical reaction can always be broken down into elementary reactions at the expense 
of increasing system size. In applications such as population ecology or social science, the idea of 
partial propensities can, however, only be used if the (generalized) reactions are at most binary and 
one species can be factored out, i.e. if the propensity for every reaction between species S, and Sj 
can be written as = c^nih{nj). Besides this structural limitation, the computational performance 
of the particular algorithms presented here can be limited in several situations. One of them is the 
simulation of very small networks, where the overhead of the data structures involved in PDM and 
SPDM may not be amortized by the gain in efficiency and a simulation using DM may be more effi- 
cient. In multiscale networks, where the propensities span several orders of magnitude, PDM is slower 
than SPDM. In multiscale networks where a small subset (-C A'^) of all reactions accounts for almost 
all of the reaction firings, however, the overhead of the data structures involved in SPDM, including 
their initialization, may not be amortized by the gain in efficiency. Finally, PDM and SPDM were 
designed to have a computational cost that scales linearly with the number of species rather than the 
number of reactions. In reaction networks in which the number of reactions grows sub-linearly with 
the number of species, this becomes a disadvantage. In such cases, SSA formulations that scale with 
the number of reactions are favorable. 

The classification of reaction networks according to their "difficulty" is still largely an open question. 
Besides system size, degree of coupling, and multiscaling (spectrum of time scales), there might also 
be other network properties that influence the computational cost of the various SSA formulations. 
Automatized selection of the most efficient SSA formulation for a given network would require both a 
systematic classification of networks and a prediction of the computational cost of SSA formulations 
based on network properties. This might require a more detailed cost analysis of the algorithms and 
a set of standard benchmark problems that are designed to cover the entire range of performance- 
relevant parameters. 

Taken together, our results suggest that PDM and SPDM can potentially offer significant perfor- 
mance improvements especially in strongly coupled networks, including the simulation of colloidal 
aggregation [31, 32, 33, 34, 35], Becker-Doring-like nucleation-and-growth reactions [37], and scale- 
free biochemical reaction networks, where certain hubs are strongly coupled [38, 39, 40, 27]. Finally, 
the use of partial propensities is not limited to exact SSA formulations, and we also expect approx- 
imate methods to benefit from it. The software implementations of PDM and SPDM will be made 
available as open source on the web page of the authors. 
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A The original SSA algorithms 

Gillespie's direct method (DM) consists of the following steps: 

1. Set t 0; initialize n, a^yfx, and a 

2. Sample /x: generate a uniform random number ri G [0, 1) and determine /x as the smallest integer 
satisfying n < a^i/a (see Eq. 1) 

3. Sample r: generate a uniform random number r2 G [0, 1) and compute the real number r as 
r = —a~^ ln(r2) (see Eq. 1) 

4. Update: n <— n + i/^, where is the stoichiometry of reaction /x; recompute all and a 

5. t <— f + r; go to step 2 

The first reaction method (FRM) uses a different sampling strategy for /j, and r as follows: r = 
min[{ri, r2, . . . , tm}] and is the index of the smallest r. The probability density of the time to the 
i^^ reaction, Tj, is given hj = aje""'"^*. 

B Algebraic equivalence of PDM's sampling scheme to that of Gille- 
spie's direct method 

In the direct method (DM), the next reaction index is sampled as 



where ri is a uniform random number G [0, 1) and a„,, is the propensity of reaction m. Without loss 
of generality, we identify /i' by a unique pair of indices, /' and such that /i' = L//_j/. Using this 
mapping to a group (row) index /' and an element (column) index J', Eq. 7 becomes 




(7) 



m=l 





J'-l J' 



= mm 



(8) 



i=0 Vj j=l 



such that n = L/^j. This can be written for the group (row) index / alone 



I' 




(9) 



1=0 Vj 
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and the element (column) index J alone 



J = min 



i-i J' 

i=0 Vi j=l 



(10) 



Using the definitions for Sj and Ilj, Eqs. 9 and 10 are equivalent to Eqs. 2 and 3, respectively. 



C Computational cost and memory requirement of PDM and SPDM 
C.l Computational cost 

The computational cost of PDM is governed by the following steps: (a) sampling the index of the next 
reaction and (b) updating the population n and the partial-propensity structure 11. The computational 
cost of SPDM is the same as that of PDM. 



Computational cost of sampling the index of the next reaction. For any chemical reaction 
network with N species, the number of arrays in the partial-propensity structure 11 is at most -|- 1, 
which is also the maximum length of S) and A. The number of entries in each array Ilj is at most 
2N, since any species can react with at most N species in bimolecular reactions and undergo at most 
N unimolecular reactions. Sampling the index of the next reaction involves two steps: (a) a linear 
search for the group index / in XI and (b) a linear search for the element index J in 11/. Since 5] is at 
most of length A'' -|- 1, the first step is 0(N). The second step is also 0{N), since no Ilj can be longer 
than 2N. The overall computational cost of sampling the next reaction is thus 0{N) for networks of 
any degree of coupling. 



Computational cost of the update. Let the maximum number of chemical species involved in 
any reaction (as rcactants or products) be given by the constant s (constant with system size). The 
computational cost of updating n is thus s G 0(1). In PDM, only the partial propensities of bi- 
molecular reactions need to be updated. The total number of entries in the third update structure 
U*^^^ is, thus, equal to the number of bimolecular reactions. In addition, the total number of entries 
in n that depend on any nj is always less than or equal to N, as any species Sj can only react 
with itself and the remaining — 1 species in bimolecular reactions. Therefore, the upper bound for 
the total number of partial propensities in 11 to be updated after executing any reaction is sN G 0{N). 

In summary, the computational cost of PDM is 0{N), irrespective of the degree of coupling in the 
reaction network (see Table 3 for benchmark results). 



C.2 Memory requirement 

The memory requirement of PDM is given by the total size of the data structures n, 11, L, A, XI, 
UW,U(2),and U(3). 



The partial-propensity structure 11 and the look-up tabic L have the same size. Since every reaction 
is accounted for exactly once, each structure requires 0{M) memory. A, n, and S arc all at most 
of length N + 1 and thus require 0{N) memory. The sizes of U*-^-' and U*-^-* are 0{M), and the size 
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of U(^) is proportional the number of bimolecular reactions and, hence, 0{M) if all reactions are 
bimolecular. 



In summary, the memory requirement of PDM is 0(M). SPDM requires an additional N + M memory 
to store the reordered index lists (see Table 4). 



D Benchmark problem definitions 



D.l Colloidal aggregation model 



The reaction network of the colloidal aggregation model is defined by: 



n = 1, 



; m = n, . . . , N — n 



Sq + Sp-q p=l,...,N; q = l,..., 



(11) 



For an even number of species N, the partial-propensity structure for this network is: 

(0) 



n = <^ 



Ho 
Hi 

n2 



2 



. Cl,l^\^ Ci,2n2 Ci,3n3 
. C2,l C2,2^^ C2,3n3 . 



c, jvnjv 

'2 2 

Co N_nN_ 
^'2 2 



CN_ 1 CJV 9 
2 ' 2 ' 



Cjv jv Cjv 

2 ' 4 2 ' 



Hat = (^cat,! CAr,2 cn,3 



For odd iV, the structure looks similar. 



D.2 Network of bimoleculcir reactions 



2 2 / 

5n_ n_ ) 

2 ' 2 / 



Cl,N-inN 
C2,N~2nN 



(12) 



The network of bimolecular reactions is given by: 



ra=l,...,iV-l; m = n + l,...,iV; 

p = min [{1, . . . , A^}\{n, m}] ; q = min [{1, . . . , N}\{n, m,p}] 



(13) 



The partial-propensity structure for this reaction network is: 

r no = (0) 

III = (Cl,2n2 Ci,3n3 Ci,4n4 
n2 = (C2,3n3 02,4^4 £2,5^5 



n = <^ 



■ C2,NnN) 



njv_i = (civ_i,ivniv) 
= (0) . 



(14) 
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D.3 LineEir chain model 



The reactions of the hneax chain model are given by: 

Si^ Si+i i = l,...,iV-l, (15) 



and the partial-propensity structure is: 



n= < 



r no = (0) 

Hi = (ci) 

n2 = (C2) 



(16) 



[ Hjv = (0) . 



D.4 Heat-shock response model 

The heat-shock response model [36] was obtained from Dr. Hong Li and Prof. Linda Petzold (UCSB) 
and is publicly available as part of the StochKit package [41] . 
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Figure 1: (a) Illustration of the data structures in PDM for the example reaction network shown in 
(b). Note that there may be arrays Ilj, i = 1,. . . ,N, containing at most one negative entry if the 
corresponding rii = 0. Indeed, in this example, 112,1 < and A2 < if n2 = 0. This, however, poses 
no problem in sampling / and J as all for which = are zero and hence the corresponding group 
indices / are never selected. 
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Figure 2: Computational costs of PDM (circles), SPDM (diamonds), and SDM (squares). See main 
text for the simulation parameters and initial conditions used. The average CPU time per reaction 
(i.e. per time step), 6, is shown as a function of system size quantified by the number of species 
N . 6 is defined as the CPU time needed to simulate the system up to final time T, divided by 
the number of reactions executed during this time, and averaged over 100 independent runs (er- 
ror bars are smaller than symbol size). The solid lines are the corresponding least-squares fits of 
the scahng Q(N) of PDM, SPDM, and SDM with the model aC^ + hC^ + cCp on a linear scale (see 
Table 3), where a, 6, and c are the fitted constants, (a) Logarithmic plot of the results for the 
colloidal aggregation model. The fits are: eP^'^Z/is = 0.0022Ar + 0.050A^°-5 + 0.22, Q^^^^ j = 
0.0027iV + 0.053A^°-5 + 0.20, and O^^^Z/xs = O.OOOSlAf^+O.OlSA^+O.Sl. (b) Logarithmic plot of the 
results for the network of bimolecular reactions. The fits are: Q^^^/i^s = 0.038N, Q^^^^/i^s = 
0.039Ar, and Q^^^/iis = 0.00061iV2+0.027Ar+0.15. (c) Linear plot of the results for the linear chain 
model. The fits are: Q^^^/fis = 0.000657V+0.19, Q^^^^/fis = 0.00157V°-5+0.20, and Q^^^/fis = 
0.0029Af-0.0025A^0-5+0.15. In all cases, the computational cost e(A^) of PDM and SPDM is 0{N). 
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Figure 3: Measured distributions of the number of partial propensities (for PDM and SPDM, red line) 
and propensities (for SDM, blue line) that need to be updated after firing any reaction of: (a) the 
colloidal aggregation model, (b) the network of bimolecular reactions, and (c) the heat-shock response 
model. Dots indicate medians, horizontal bars the upper and lower quartiles, and vertical bars the 
upper and lower extrema (maximum and minimum). The dotted lines denote the minimum, average 
and maximum degree of coupling k of the reaction networks (see Table 2). The number of updates in 
SDM [12] using a dependency graph is governed by the degree of coupling of the network. In PDM 
and SPDM, less updates need to be performed since partial propensities depend on the population of 
at most one species and are constant for unimolecular reactions. 
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Tables 



1. Initialization: set t <— 0; initialize n, 11, A, XI; a <— J2^o ^ generate L, \J^^\ U*^^\ 
and U(3) 

2. Sample /i: generate a uniform random number ri G [0, 1) and determine the group index / and 
the clement index J according to Eqs. (2), (4), and (5); <— L/^j 

3. Sample r: generate a uniform random number r2 G [0, 1) and compute the time to next reaction 
T as T «— a~^ln(r^^) 

4. Update n: for each index k of U^^'* , I ^ U^^^ and n; ^ n; + U^^^ 

5. Update 11, A, S and compute Aa, the change in a: 
For each index k of U^"*"^ , do: 

5.2. For each index m of do: 

5.2.1. (4, j^) ^ U[^'2^ (Eq. 6) 

5.2.2. U^i .1 ^U^i .1 + c^ujfi, if//4 
Hji ji <— n,.; n + \cu\][ \,, if i=iL, 

5.2.3. A,,^ ^ A,._ if ¥4 

5.2.4. Stemp ^ Sj!^ 

5.2.5. S.-i <— TT.,-! A,-! 

5.2.6. Aa ^ Aa + T^i^ — Etcmp 

5.3. Aa ^ Aa + n^A; — S^; ^ n^A^ 

6. Update a and increment time: a ^ a + Aa; Aa <— 0; t <— t + r 

7. Go to step 2 

Table 1: Detailed algorithm for the partial-propensity direct method PDM. 
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Tabic 2: Properties of the benchmark cases. The number of species, number of reactions, and mini- 
mum, average, maximum out-degree of the dependency graph (degree of coupling) are given for the 
benchmark cases defined in Appendix D: the colloidal aggregation model (CA), the network of bi- 
molecular reactions (NB), the linear chain model (LC), and the heat-shock response model (HSR). (*) 
In the linear chain model the degree of coupling is 1 only for the last reaction, since its product is not 
a reactant anywhere else. 
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Tabic 3: Number of compute operations needed by the different algorithms (PDM, SPDM, SDM) for 
the different test cases (CA: colloidal aggregation model; NB: network of bimolecular reactions; LC: 
linear chain model; HSR: heat-shock response model). is the average number of operations needed 
to sample the next reaction /x. Cn is the average number of entries in the population n that need to 
be updated after any reaction. Cp is the average number of partial propensities (or propensities for 
SDM) that need to be updated after any reaction. The operation counts are averaged over all reactions 
executed during 100 independent runs of each benchmark over the range of N shown in Fig. 2. The 
average numbers are then fitted with the models given here (with correlation coefficient of at least 
0.98 in all cases). See Fig. 3 for the distribution of the number of updates. 
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Total 



0(Ar2) ^ Q(^) 
0(Ar2) ^ Q(^) 
0(iV) = 0(M) 
557 



SDM 



n 



N 
N 
N 
28 



c, a 



N(N-l) 
2 

- 1 

Gl 



dependency graph 



1.2Ar3 - 2.5Ar2 + 2.3A^ 
2N^ - 7N'^ + 5N 
2{N-1) 

3G0 



N 



JV^(iV-l) 
2 

N{N - 1) 

;I708 



Total 



0(iV3) = 0{NM) 
0{N^) = 0{NM) 
0{N'^) = 0{NM) 



2218 



Table 4: Total amount of computer memory needed by the different algorithms (PDM, SPDM, SDM) 
for the different test cases (CA: colloidal aggregation model; NB: network of bimolecular reactions; 
LC: linear chain model; HSR: heat-shock response model). The sizes of all major data structures 
(c and a are the arrays of specific probability rates and reaction propensities, respectively; v is the 
stoichiometry matrix; see Sec. 33.1 for other definitions) as well as the total memory requirements 
are given as determined analytically for all benchmark simulations. SPDM and SDM need additional 
memory of size M + N and M, respectively, for the reordered index lists. This, however, does not 
change the overall scaling of the total memory requirements. 
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